Trimming metabolism

Here, I am experimenting with trimming the GPP results by date to see how that changes the mean/median estimates for each lake.
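A minimal sketch of the trimming step, assuming daily GPP estimates are held as (lake, date, GPP) records. The record layout, the function names, and the toy values are all hypothetical; the June 1 to Sept 1 window is the "summer" definition used later in this document, treated here as end-exclusive (i.e. < Sept 1) for northern hemisphere lakes.

```python
from datetime import date
from statistics import mean, median

# Hypothetical daily GPP records: (lake name, date, GPP estimate).
# Values are made up purely to exercise the code.
records = [
    ("LakeA", date(2015, 5, 20), 1.0),
    ("LakeA", date(2015, 7, 1), 3.0),
    ("LakeA", date(2015, 8, 15), 5.0),
    ("LakeA", date(2015, 10, 5), 1.0),
    ("LakeB", date(2015, 4, 10), 2.0),
    ("LakeB", date(2015, 4, 20), 4.0),
]

def in_window(d, start=(6, 1), end=(9, 1)):
    """True if d falls in [start, end) within its own calendar year."""
    return date(d.year, *start) <= d < date(d.year, *end)

def summarize(rows):
    """Return {lake: (mean GPP, median GPP, number of days)}."""
    by_lake = {}
    for lake, _, gpp in rows:
        by_lake.setdefault(lake, []).append(gpp)
    return {lake: (mean(v), median(v), len(v)) for lake, v in by_lake.items()}

untrimmed = summarize(records)
summer = summarize([r for r in records if in_window(r[1])])
```

With these toy values LakeB has no observations inside the window, so it is simply absent from `summer`. That is the same way a lake with only out-of-window data drops out of a summer-only comparison.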

Boxplots - summer only versus untrimmed

Most, but not all, lakes end up having higher GPP overall if we trim to summer. Note that Prairie Lake drops out because there are only about 2 weeks of data for this lake, and it doesn’t fall in the “summer” window.

Comparison across lakes

How does this change the distribution of GPP across lakes?

It shuffles things around a little bit, but overall our top 5 most/least productive lakes are the same lakes.

Top 5 least productive (untrimmed):

lakeName
Taupo
Feeagh
Jordan
Almberga
Simoncouche

Top 5 least productive (summer only):

lakeName
Taupo
Jordan
Feeagh
Almberga
Simoncouche

Top 5 least productive (extended summer):

lakeName
Taupo
Jordan
Feeagh
Almberga
Simoncouche

Top 5 most productive (untrimmed):

lakeName
Balaton
Mirror
PrairiePothole
Acton
Taihu

Top 5 most productive (summer only):

lakeName
PrairiePothole
Mirror
Taihu
Balaton
Acton

Top 5 most productive (extended summer):

lakeName
Balaton
Mirror
PrairiePothole
Taihu
Acton
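
The "same lakes, slightly shuffled" observation can be checked directly from the lists. The sketch below copies the three most-productive lists from above and compares each against the untrimmed one. It assumes the listed order reflects rank, which the tables do not state explicitly; the variable names are hypothetical.

```python
# Top 5 most productive lakes, copied from the lists above.
top5_most = {
    "untrimmed": ["Balaton", "Mirror", "PrairiePothole", "Acton", "Taihu"],
    "summer": ["PrairiePothole", "Mirror", "Taihu", "Balaton", "Acton"],
    "extended": ["Balaton", "Mirror", "PrairiePothole", "Taihu", "Acton"],
}

baseline = top5_most["untrimmed"]
results = {}
for name, lakes in top5_most.items():
    # Same five lakes, regardless of order?
    same_membership = set(lakes) == set(baseline)
    # Number of list positions holding a different lake than the untrimmed list
    # (assumes list order is rank order).
    positions_changed = sum(a != b for a, b in zip(lakes, baseline))
    results[name] = (same_membership, positions_changed)

for name, (same, changed) in results.items():
    print(f"{name}: same lakes = {same}, positions changed = {changed}")
```

The same comparison works for the least-productive lists by swapping in those names.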

Examine a few lakes more closely

I wanted to start by looking at a few lakes where GPP was lower after trimming.

Balaton. In this case, estimated GPP in the trimmed dataset is about half of the full dataset because we end up missing some high GPP days in late August/early September.

TheLoch. In this case, estimated GPP in the trimmed dataset is about half of the full dataset because we end up missing some high GPP days in late August/early September. Knowing this lake pretty well (one of my dissertation lakes), I’m surprised to see a late season peak. Given that these rates are still quite low, I’m comfortable trimming the data to < Sept 1.

Trout Lake. Overall this is a super unproductive lake, but with the Sept 1 cut-off we do seem to miss part of a second “peak” in September.

Little Rock Lake. This is a NEON lake, where they used DO profilers rather than stationary DO sensors (among other continuity issues with their wind and temperature sensors), so I ended up with a pretty low number of days.

Trout Bog. No discernible seasonal pattern here, but at least if we extend ‘summer’ out to Oct 1, we capture more of these relatively high GPP days.

Rotoiti. Summer and summer+ estimates are identical because estimates for the shoulder season in this southern hemisphere lake are missing (Nov1-Dec and April1-May1).

When trimming results in much higher GPP

Next I wanted to pull out a few lakes where trimming to ‘summer’ results in much higher estimates.

Kentucky. Since we have an entire year of data for this lake, trimming by date (June1-Sept1) does a pretty good job of characterizing “peak” productivity, though we do miss the rising and falling limbs. If we add in the shoulder season (“+”) we get a few more days (still missing some relatively high GPP days), which brings down the mean/median.

Prairie Pothole. This is a NEON lake, where they used DO profilers rather than stationary DO sensors (among other continuity issues with their wind and temperature sensors), so I ended up with a pretty low number of days. GPP is highly variable. Note that whether or not we trim the data, this lake is still one of the top 5 most productive in the dataset, FWIW. This range is still huge, though, and given the high number of missing values I’m inclined to throw away this lake.

Annie. Similar case to Kentucky, where the estimates go up quite a bit because we are cutting out the least productive times of year. We do end up missing some relatively high GPP days with the Sept 1 cut-off, however. Adding in the shoulder seasons helps capture a few more relatively high GPP days, but overall brings down the mean a tiny bit.

Taihu. Similarly, by excluding Oct-January we are capturing the most productive time of year in this lake.

Cutoff for # of days missing

“Summer” is 92 days long and “summer+”/“extended summer” is 153 days long.

Whether we consider just “summer” (June1-Sept1) or “summer+” (May1-Oct1), we end up with the same few lakes as problematic in terms of number of days missing. If we use 50% as a cut-off, these lakes are Balaton (borderline), Barco, Prairie Lake, Prairie Pothole, and Rotoiti.
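
As a quick check on the window lengths and the 50% rule, here is a small sketch. It treats both windows as end-exclusive (June 1 up to but not including Sept 1, and May 1 up to but not including Oct 1), which matches the 92 and 153 day figures above. The function names, and the choice to flag a lake only when strictly more than half the window is missing, are assumptions for illustration.

```python
from datetime import date

def window_length(start, end, year=2015):
    """Days in [start, end); `year` is an arbitrary year (no February involved)."""
    return (date(year, *end) - date(year, *start)).days

summer_days = window_length((6, 1), (9, 1))     # June 1 up to, not including, Sept 1
extended_days = window_length((5, 1), (10, 1))  # May 1 up to, not including, Oct 1

def too_sparse(n_days_with_gpp, window_days, max_missing=0.5):
    """True if more than `max_missing` of the window has no GPP estimate.

    Assumption: the comparison is strict, so a lake missing exactly half
    of the window is kept (a borderline case).
    """
    missing = 1 - n_days_with_gpp / window_days
    return missing > max_missing
```

For example, a hypothetical lake with 14 days of estimates inside the 92-day window would be flagged by `too_sparse(14, summer_days)`, while one with 46 days would sit exactly on the cut-off and be kept under this strict comparison.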

Cut-offs for lakes with lots of data